92 research outputs found

    Crowdsourced real-world sensing: sentiment analysis and the real-time web

    The advent of the real-time web is proving both challenging and at the same time disruptive for a number of areas of research, notably information retrieval and web data mining. As an area of research reaching maturity, sentiment analysis offers a promising direction for modelling the text content available in real-time streams. This paper reviews the real-time web as a new area of focus for sentiment analysis and discusses the motivations and challenges behind such a direction

    An evaluation of the role of sentiment in second screen microblog search tasks

    The recent prominence of the real-time web is proving both challenging and disruptive for information retrieval and web data mining research. User-generated content on the real-time web is perhaps best epitomised by content on microblogging platforms, such as Twitter. Given the substantial quantity of microblog posts that may be relevant to a user's query at a point in time, automated methods are required to sift through this information. Sentiment analysis offers a promising direction for modelling microblog content. We build and evaluate a sentiment-based filtering system using real-time user studies. We find a significant role played by sentiment in the search scenarios, observing detrimental effects in filtering out certain sentiment types. We make a series of observations regarding associations between document-level sentiment and user feedback, including associations with user profile attributes, and users' prior topic sentiment

    Classifying sentiment in microblogs: is brevity an advantage?

    Microblogs as a new textual domain offer a unique proposition for sentiment analysis. Their short document length suggests any sentiment they contain is compact and explicit. However, this short length coupled with their noisy nature can pose difficulties for standard machine learning document representations. In this work we examine the hypothesis that it is easier to classify the sentiment in these short form documents than in longer form documents. Surprisingly, we find classifying sentiment in microblogs easier than in blogs and make a number of observations pertaining to the challenge of supervised learning for sentiment analysis in microblogs

    A study of inter-annotator agreement for opinion retrieval

    Evaluation of sentiment analysis, like large-scale IR evaluation, relies on the accuracy of human assessors to create judgments. Subjectivity in judgments is a problem for relevance assessment and even more so in the case of sentiment annotations. In this study we examine the degree to which assessors agree upon sentence-level sentiment annotation. We show that inter-assessor agreement is not contingent on document length or frequency of sentiment but correlates positively with automated opinion retrieval performance. We also examine the individual annotation categories to determine which categories pose most difficulty for annotators

    On using Twitter to monitor political sentiment and predict election results

    The body of content available on Twitter undoubtedly contains a diverse range of political insight and commentary. But, to what extent is this representative of an electorate? Can we model political sentiment effectively enough to capture the voting intentions of a nation during an election campaign? We use the recent Irish General Election as a case study for investigating the potential to model political sentiment through mining of social media. Our approach combines sentiment analysis using supervised learning and volume-based measures. We evaluate against the conventional election polls and the final election result. We find that social analytics using both volume-based measures and sentiment analysis are predictive and we make a number of observations related to the task of monitoring public sentiment during an election campaign, including examining a variety of sample sizes, time periods as well as methods for qualitatively exploring the underlying content

    Sentiment analysis and real-time microblog search

    This thesis sets out to examine the role played by sentiment in real-time microblog search. The recent prominence of the real-time web is proving both challenging and disruptive for a number of areas of research, notably information retrieval and web data mining. User-generated content on the real-time web is perhaps best epitomised by content on microblogging platforms, such as Twitter. Given the substantial quantity of microblog posts that may be relevant to a user query at a given point in time, automated methods are required to enable users to sift through this information. As an area of research reaching maturity, sentiment analysis offers a promising direction for modelling the text content in microblog streams. In this thesis we review the real-time web as a new area of focus for sentiment analysis, with a specific focus on microblogging. We propose a system and method for evaluating the effect of sentiment on perceived search quality in real-time microblog search scenarios. Initially we provide an evaluation of sentiment analysis using supervised learning for classifying the short, informal content in microblog posts. We then evaluate our sentiment-based filtering system for microblog search in a user study with simulated real-time scenarios. Lastly, we conduct real-time user studies for the live broadcast of the popular television programme, the X Factor, and for the Leaders Debate during the Irish General Election. We find that we are able to satisfactorily classify positive, negative and neutral sentiment in microblog posts. We also find a significant role played by sentiment in many microblog search scenarios, observing some detrimental effects in filtering out certain sentiment types. We make a series of observations regarding associations between document-level sentiment and user feedback, including associations with user profile attributes, and users’ prior topic sentiment

    DCU at the TREC 2008 Blog Track

    In this paper we describe our system, experiments and results from our participation in the Blog Track at TREC 2008. Dublin City University participated in the ad hoc retrieval, opinion finding and polarised opinion finding tasks. For opinion finding, we used a fusion of approaches based on lexicon features, surface features and syntactic features. Our experiments evaluated the relative usefulness of each of the feature sets and achieved a significant improvement on the baseline

    Combining social network analysis and sentiment analysis to explore the potential for online radicalisation

    The increased online presence of jihadists has raised the possibility of individuals being radicalised via the Internet. To date, the study of violent radicalisation has focused on dedicated jihadist websites and forums. This may not be the ideal starting point for such research, as participants in these venues may be described as “already made-up minds”. Crawling a global social networking platform, such as YouTube, on the other hand, has the potential to unearth content and interaction aimed at radicalisation of those with little or no apparent prior interest in violent jihadism. This research explores whether such an approach is indeed fruitful. We collected a large dataset from a group within YouTube that we identified as potentially having a radicalising agenda. We analysed this data using social network analysis and sentiment analysis tools, examining the topics discussed and what the sentiment polarity (positive or negative) is towards these topics. In particular, we focus on gender differences in this group of users, suggesting more extreme and less tolerant views among female users

    Topic-dependent sentiment analysis of financial blogs

    While most work in sentiment analysis in the financial domain has focused on the use of content from traditional finance news, in this work we concentrate on more subjective sources of information, blogs. We aim to automatically determine the sentiment of financial bloggers towards companies and their stocks. To do this we develop a corpus of financial blogs, annotated with polarity of sentiment with respect to a number of companies. We conduct an analysis of the annotated corpus, from which we show there is a significant level of topic shift within this collection, and also illustrate the difficulty that human annotators have when annotating certain sentiment categories. To deal with the problem of topic shift within blog articles, we propose text extraction techniques to create topic-specific sub-documents, which we use to train a sentiment classifier. We show that such approaches provide a substantial improvement over full document classification and that word-based approaches perform better than sentence-based or paragraph-based approaches

    Land use strategies of the ancient Maya in seasonally dry tropical forest ecosystems of the Yucatan Peninsula

    Throughout the history of human-environmental interactions in Central America, the ancient Maya are one of the most contested regarding the extent to which their land-use strategies degraded their environment. For over 3500 years, the ancient Maya manipulated plant communities by promoting economically important species and removing those that had little use. These strategies potentially impacted the modern forests of Central America, by creating a legacy of economically important species in the modern assemblages. Along with the promotion of useful species, the ancient Maya also consistently introduced fire into ecosystems that would have limited natural exposure and to some extent removed forest vegetation for settlement structures. It is this extent of forest removal that remains one of the most contentious aspects of our understanding of ancient Maya land-use strategies. Palaeoecological records (fossil pollen evidence) throughout Central America show a strong signal for extensive forest cover removal (declining arboreal pollen) and maize agriculture, leading many to suggest these processes were closely related to population pressures and food demand. These signals for supposed deforestation have been added to a pre-existing link between climate drying and societal collapse (ca 750-1100 CE), leading many to suggest that extensive environmental degradation was one of the major drivers of the collapse of the Classic Maya Civilisation. Whilst the evidence for intensive drought is founded in robust palaeoclimate records throughout Central America and the evidence for the societal decline is well documented across many settlements throughout the region, the evidence for deforestation is not yet as well established. To date, the majority of records that interpret these phases of deforestation are located in assumed high population density centres.
These sites are then often extrapolated to the entirety of the ancient Maya society, resulting in little attention being paid to how different types of settlements may have interacted with the forest environment. Here we show two new palaeoecological investigations from lower-density settlements, with one being the first palaeoecological representation of ancient Maya land-use from an island site. Pollen and charcoal records were used to determine changes in vegetation and the fire regime associated with ancient Maya land-use from the seasonally dry tropical forested ecosystems of the Yucatan Peninsula. Comparisons between an inland site (Laguna Esmeralda) and an island site (Ambergris Caye) reveal similarities regarding the extent to which the forest was impacted by periods of cultivation, but also differences regarding how activities changed in response to periods of drought. This research presents two new chronological baselines for Zea mays cultivation on the mainland (5.5 kyr cal. BP) and the island (4.8 kyr cal. BP) showing these regions were actively managed long before previously suggested. In addition to the long-term records of ancient Maya land-use, a series of surface samples from Laguna Esmeralda and the adjacent Lake Chichancanab were used to uncover how the modern forest is represented in these two different sized lakes and aid in the interpretation of the palaeoecological records. Using these interpretations of ancient Maya land-use from lower-density settlements, this research shows that the aforementioned hypothesis of extreme environmental degradation likely only represents a perspective from higher-density settlements. The strong associations between periods of land-use and drought conditions are prominent during the Terminal Classic Period (750/1000 CE), where clear reductions in arboreal pollen are interpreted to reflect localised forest clearances and intensification of cultivation around a valuable water resource.
Further adaptations to drought periods are also evident from Ambergris Caye, with combined previous archaeological and current palaeoecological evidence showing the use of mixed resources during the Preclassic Abandonment Period (~250 CE). Ambergris Caye acted as a climate refuge for the ancient Maya, providing a new lens of analysis for understanding ancient Maya adaptations to instability and showing the importance of island sites in the wider perspective of the ancient Maya civilisation